Matrix Transpose on Meshes: Theory and Practice
Abstract
Matrix transpose is a fundamental communication operation that general-purpose routing schemes do not handle optimally. For two-dimensional meshes, the first optimal routing schedule is given. The strategy is simple enough to be implemented, but details of the available hardware are not favorable. However, alternative algorithms designed along the same lines give an improvement on the Intel Paragon.
Similar resources
Optimal Algorithm for Matrix Transpose on Wormhole-Switched Meshes
The mesh is an architecture that has many scientific applications, and matrix transpose is an important permutation frequently performed in various techniques involving systems of linear equations. In this paper, we present an optimal algorithm for performing matrix transpose on meshes that support wormhole switching. If N is even, our algorithm takes 2 2+ N communication steps to perform matri...
An accelerated gradient based iterative algorithm for solving systems of coupled generalized Sylvester-transpose matrix equations
In this paper, an accelerated gradient based iterative algorithm for solving systems of coupled generalized Sylvester-transpose matrix equations is proposed. The convergence analysis of the algorithm is investigated. We show that the proposed algorithm converges to the exact solution for any initial value under certain assumptions. Finally, some numerical examples are given to demons...
Some Modifications to Calculate Regression Coefficients in Multiple Linear Regression
In a multiple linear regression model, there are instances where one has to update the regression parameters. In such models as new data become available, by adding one row to the design matrix, the least-squares estimates for the parameters must be updated to reflect the impact of the new data. We will modify two existing methods of calculating regression coefficients in multiple linear regres...
Adaptive Matrix Transpose Algorithms for Distributed Multicore Processors
An adaptive parallel matrix transpose algorithm optimized for distributed multicore architectures running in a hybrid OpenMP/MPI configuration is presented. Significant boosts in speed are observed relative to the distributed transpose used in the state-of-the-art adaptive FFTW library. In some cases, a hybrid configuration allows one to reduce communication costs by reducing the number of MPI ...
Publication date: 1997